Smart Paper Notes

Transfer Learning of an Ensemble of DNNs for SSVEP BCI Spellers without User-Specific Training

Osman Berke Guney , Huseyin Ozkan

Category: Machine Learning

2022-09-03

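Objective: Steady-state visually evoked potentials (SSVEPs), measured with electroencephalography (EEG), yield decent information transfer rates (ITRs) in brain-computer interface (BCI) spellers. However, the high-performing SSVEP BCI spellers currently in the literature require lengthy and tiring user-specific training to adapt the system to each new user, including data collection through EEG experiments, algorithm training, and calibration, all before the system can actually be used. This impedes the widespread use of BCIs. To ensure practicality, we propose a highly novel target identification method based on an ensemble of deep neural networks (DNNs) that does not require any user-specific training. Method: We exploit existing literature datasets from participants of previously conducted EEG experiments to first train a global target identifier DNN, which is then fine-tuned to each participant. We transfer this ensemble of fine-tuned DNNs to the new user instance, determine the k most representative DNNs according to the participants' statistical similarities to the new user, and predict the target character through a weighted combination of the ensemble predictions. Results: On two large-scale datasets, benchmark and BETA, our method achieves impressive ITRs of 155.51 bits/min and 114.64 bits/min. Code is available for reproducibility: https://github.com/osmanberke/ensemble-fnns Conclusion: The proposed method significantly outperforms all state-of-the-art alternatives for all stimulation durations in [0.2-1.0] seconds on both datasets. Significance: Our Ensemble-DNN method has the potential to promote the practical, widespread deployment of BCI spellers in daily life, as it provides the highest performance while enabling immediate system use without any user-specific training.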
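The selection-and-weighting step described in the Method part can be illustrated with a minimal Python sketch. The abstract does not state how similarity is measured or how the weights are derived, so the similarity scores, the normalization of weights, and the function name `ensemble_predict` are assumptions made purely for illustration, not the authors' implementation.

```python
def ensemble_predict(similarities, predictions, k):
    """Combine the outputs of the k participant DNNs most similar to a new user.

    similarities: one positive score per participant DNN (hypothetical
                  measure; higher means closer to the new user).
    predictions:  one list of per-character scores per participant DNN.
    Returns the index of the predicted target character.
    """
    # Rank participant DNNs by similarity and keep the k most representative.
    ranked = sorted(range(len(similarities)),
                    key=lambda i: similarities[i], reverse=True)
    top_k = ranked[:k]

    # Assumed weighting: similarities normalized so the weights sum to 1.
    total = sum(similarities[i] for i in top_k)
    weights = [similarities[i] / total for i in top_k]

    # Weighted sum of the selected DNNs' score vectors.
    n_chars = len(predictions[0])
    combined = [0.0] * n_chars
    for w, i in zip(weights, top_k):
        for c in range(n_chars):
            combined[c] += w * predictions[i][c]

    # The predicted character is the one with the highest combined score.
    return max(range(n_chars), key=lambda c: combined[c])
```

Calling `ensemble_predict([0.9, 0.1, 0.8], scores, k=2)` would ignore the second participant's network entirely and blend the first and third with weights proportional to 0.9 and 0.8.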

Biomedical image analysis competitions: The state of current participation practice

Matthias Eisenmann , Annika Reinke , Vivienn Weru , Minu Dietlinde Tizabi , Fabian Isensee , Tim J. Adler , Patrick Godau , Veronika Cheplygina , Michal Kozubek , Sharib Ali

Categories: Computer Vision | Machine Learning

2022-12-16

The number of international benchmarking competitions is steadily increasing in various fields of machine learning (ML) research and practice. So far, however, little is known about the common practice as well as bottlenecks faced by the community in tackling the research questions posed. To shed light on the status quo of algorithm development in the specific field of biomedical imaging analysis, we designed an international survey that was issued to all participants of challenges conducted in conjunction with the IEEE ISBI 2021 and MICCAI 2021 conferences (80 competitions in total). The survey covered participants' expertise and working environments, their chosen strategies, as well as algorithm characteristics. A median of 72% of challenge participants took part in the survey. According to our results, knowledge exchange was the primary incentive (70%) for participation, while the reception of prize money played only a minor role (16%). While a median of 80 working hours was spent on method development, a large portion of participants stated that they did not have enough time for method development (32%). 25% perceived the infrastructure to be a bottleneck. Overall, 94% of all solutions were deep learning-based. Of these, 84% were based on standard architectures. 43% of the respondents reported that the data samples (e.g., images) were too large to be processed at once. This was most commonly addressed by patch-based training (69%), downsampling (37%), and solving 3D analysis tasks as a series of 2D tasks. K-fold cross-validation on the training set was performed by only 37% of the participants, and only 50% of the participants performed ensembling based on multiple identical models (61%) or heterogeneous models (39%). 48% of the respondents applied postprocessing steps.

Multi-Task Edge Prediction in Temporally-Dynamic Video Graphs

Osman Ülger , Julian Wiederer , Mohsen Ghafoorian , Vasileios Belagiannis , Pascal Mettes

Category: Computer Vision

2022-12-06

Graph neural networks have been shown to learn effective node representations, enabling node-, link-, and graph-level inference. Conventional graph networks assume static relations between nodes, while relations between entities in a video often evolve over time, with nodes entering and exiting dynamically. In such temporally-dynamic graphs, a core problem is inferring the future state of spatio-temporal edges, which can constitute multiple types of relations. To address this problem, we propose MTD-GNN, a graph network for predicting temporally-dynamic edges for multiple types of relations. We propose a factorized spatio-temporal graph attention layer to learn dynamic node representations and present a multi-task edge prediction loss that models multiple relations simultaneously. The proposed architecture operates on top of scene graphs that we obtain from videos through object detection and spatio-temporal linking. Experimental evaluations on ActionGenome and CLEVRER show that modeling multiple relations in our temporally-dynamic graph network can be mutually beneficial, outperforming existing static and spatio-temporal graph neural networks, as well as state-of-the-art predicate classification methods.

Alexa, Let's Work Together: Introducing the First Alexa Prize TaskBot Challenge on Conversational Task Assistance

Anna Gottardi , Osman Ipek , Giuseppe Castellucci , Shui Hu , Lavina Vaz , Yao Lu , Anju Khatri , Anjali Chadha , Desheng Zhang , Sattvik Sahai

Categories: Natural Language Processing | Artificial Intelligence

2022-09-13

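Since its inception in 2016, the Alexa Prize program has enabled hundreds of university students to explore and compete to develop conversational agents through the SocialBot Grand Challenge. The goal of the challenge is to build agents capable of conversing coherently and engagingly with humans on popular topics for 20 minutes, while achieving an average rating of at least 4.0/5.0. However, as conversational agents attempt to assist users with increasingly complex tasks, new conversational AI techniques and evaluation platforms are needed. The Alexa Prize TaskBot Challenge, established in 2021, builds on the success of the SocialBot Challenge by introducing the requirement of interactively assisting humans with real-world cooking and do-it-yourself tasks, while making use of both voice and visual modalities. This challenge requires TaskBots to identify and understand the user's need, identify and integrate task and domain knowledge, and develop new ways of engaging the user without distracting them from the task at hand, among other challenges. This paper provides an overview of the TaskBot Challenge, describes the infrastructure support provided to the teams through the CoBot Toolkit, and summarizes the approaches the participating teams took to overcome the research challenges. Finally, it analyzes the performance of the competing TaskBots during the first year of the competition.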

Generative Modelling of the Ageing Heart with Cross-Sectional Imaging and Clinical Data

Mengyun Qiao , Berke Doga Basaran , Huaqi Qiu , Shuo Wang , Yi Guo , Yuanyuan Wang , Paul M. Matthews , Daniel Rueckert , Wenjia Bai

Categories: Computer Vision | Machine Learning

2022-08-28

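Cardiovascular disease, the leading cause of death globally, is an age-related disease. Understanding the morphological and functional changes of the heart during ageing is a key scientific question, the answer to which will help us define important risk factors for cardiovascular disease and monitor disease progression. In this work, we propose a novel conditional generative model to describe the changes in the 3D anatomy of the heart during ageing. The proposed model is flexible and allows multiple clinical factors (e.g., age, gender) to be integrated into the generating process. We train the model on a large-scale cross-sectional dataset of cardiac anatomies and evaluate it on both cross-sectional and longitudinal datasets. The model demonstrates excellent performance in predicting the longitudinal evolution of the ageing heart and in modelling its data distribution.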

One Model, Any CSP: Graph Neural Networks as Fast Global Search Heuristics for Constraint Satisfaction

Jan Tönshoff , Berke Kisin , Jakob Lindner , Martin Grohe

Categories: Artificial Intelligence | Machine Learning | Neural and Evolutionary Computing

2022-08-22

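We propose a universal graph neural network architecture that can be trained as an end-to-end search heuristic for any constraint satisfaction problem (CSP). Our architecture can be trained unsupervised with policy gradient descent to generate problem-specific heuristics for any CSP in a purely data-driven manner. The approach is based on a novel graph representation for CSPs that is both generic and compact, and it enables us to process every possible CSP instance with one GNN, regardless of constraint arity, relations, or domain size. Unlike previous RL-based methods, we operate on a global search action space and allow our GNN to modify any number of variables in every step of the stochastic search. This enables our method to properly leverage the inherent parallelism of GNNs. We perform a thorough empirical evaluation in which we learn heuristics for well-known and important CSPs from random data, including graph coloring, MaxCut, 3-SAT, and MAX-k-SAT. Our approach outperforms prior approaches to neural combinatorial optimization. It can compete with, and even improve upon, conventional search heuristics on test instances that are several orders of magnitude larger and structurally more complex than those seen during training.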
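To make the idea of a global search action space concrete, the following Python sketch runs a stochastic search for graph coloring in which every variable may change in each step. It is only an illustration under stated assumptions: the hand-written `min_conflict_policy` is a hypothetical stand-in for the trained GNN, and the function names, the weighting `1 / (1 + 10 * count)`, and the best-so-far bookkeeping are not taken from the paper.

```python
import random


def count_conflicts(edges, coloring):
    """Number of edges whose two endpoints share a color."""
    return sum(1 for u, v in edges if coloring[u] == coloring[v])


def min_conflict_policy(edges, coloring, n_colors):
    """Hypothetical stand-in for the GNN: returns one weight vector per vertex,
    favoring colors that few neighbors currently use."""
    counts = [[0] * n_colors for _ in coloring]
    for u, v in edges:
        counts[u][coloring[v]] += 1
        counts[v][coloring[u]] += 1
    return [[1.0 / (1 + 10 * c) for c in row] for row in counts]


def global_search(edges, n_vertices, n_colors, policy, steps, seed=0):
    """Stochastic search where each step resamples ALL variables at once."""
    rng = random.Random(seed)
    coloring = [rng.randrange(n_colors) for _ in range(n_vertices)]
    best, best_conflicts = list(coloring), count_conflicts(edges, coloring)
    for _ in range(steps):
        if best_conflicts == 0:
            break
        # Global action: one distribution per variable, sampled jointly.
        weights = policy(edges, coloring, n_colors)
        coloring = [rng.choices(range(n_colors), weights=weights[v])[0]
                    for v in range(n_vertices)]
        conflicts = count_conflicts(edges, coloring)
        if conflicts < best_conflicts:
            best, best_conflicts = list(coloring), conflicts
    return best, best_conflicts
```

Because the per-variable distributions are computed independently given the current assignment, a step like this can in principle be evaluated for all variables in parallel, which is the property the abstract attributes to the GNN-based approach.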

Subject-Specific Lesion Generation and Pseudo-Healthy Synthesis for Multiple Sclerosis Brain Images

Berke Doga Basaran , Mengyun Qiao , Paul M. Matthews , Wenjia Bai

Categories: Computer Vision | Machine Learning

2022-08-03

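Understanding the intensity characteristics of brain lesions is key to defining image-based biomarkers in neurological studies and to predicting disease burden and outcome. In this work, we present a novel foreground-based generative method for modelling local lesion characteristics that can both generate synthetic lesions on healthy images and synthesize subject-specific pseudo-healthy images from pathological images. Furthermore, the method can be used as a data augmentation module to generate synthetic images for training brain image segmentation networks. Experiments on multiple sclerosis (MS) brain images acquired with magnetic resonance imaging (MRI) demonstrate that the proposed method can generate highly realistic pseudo-healthy and pseudo-pathological brain images. Data augmentation using the synthetic images improves brain image segmentation performance compared with traditional data augmentation methods as well as a recent lesion-aware data augmentation technique, CarveMix. The code will be released at https://github.com/dogabasaran/lesion-synthesis.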

Humans disagree with the IoU for measuring object detector localization error

Ombretta Strafforello , Vanathi Rajasekart , Osman S. Kayhan , Oana Inel , Jan van Gemert

Category: Computer Vision

2022-07-28

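The localization quality of automatic object detectors is typically evaluated by the Intersection over Union (IoU) score. In this work, we show that humans have a different view of localization quality. To evaluate this, we conduct a survey with more than 70 participants. The results show that for localization errors with the exact same IoU score, humans might not consider these errors equal, and they express a preference. Our work is the first to evaluate IoU with humans, and it makes clear that relying on IoU scores alone to evaluate localization errors might not be sufficient.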
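For reference, IoU for two axis-aligned boxes is the area of their intersection divided by the area of their union. The short Python sketch below computes it and constructs two different localization errors that receive the same score; the box coordinates are made-up illustrative values, not data from the survey.

```python
def iou(box_a, box_b):
    """IoU of two axis-aligned boxes given as (x_min, y_min, x_max, y_max)."""
    # Width and height of the overlap region (zero if the boxes are disjoint).
    overlap_w = max(0.0, min(box_a[2], box_b[2]) - max(box_a[0], box_b[0]))
    overlap_h = max(0.0, min(box_a[3], box_b[3]) - max(box_a[1], box_b[1]))
    intersection = overlap_w * overlap_h

    area_a = (box_a[2] - box_a[0]) * (box_a[3] - box_a[1])
    area_b = (box_b[2] - box_b[0]) * (box_b[3] - box_b[1])
    union = area_a + area_b - intersection
    return intersection / union if union > 0 else 0.0


# Illustrative boxes: the same ground truth with two different errors.
ground_truth = (0, 0, 10, 10)
shifted_right = (2, 0, 12, 10)  # prediction displaced horizontally
shifted_up = (0, 2, 10, 12)     # prediction displaced vertically

# Both predictions overlap the ground truth by 80 of 120 units of union area,
# so they receive the same IoU although the errors differ in direction.
same_score = iou(ground_truth, shifted_right) == iou(ground_truth, shifted_up)
```

The score cannot distinguish the two cases, which is the kind of situation in which the abstract reports that human raters nonetheless express a preference.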

Enforcing connectivity of 3D linear structures using their 2D projections

Doruk Oner , Hussein Osman , Mateusz Kozinski , Pascal Fua

Category: Computer Vision

2022-07-14

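Many biological and medical tasks require the delineation of 3D curvilinear structures, such as blood vessels and neurites, from image volumes. This is typically done using neural networks trained by minimizing voxel-wise loss functions that do not capture the topological properties of these structures. As a result, the connectivity of the recovered structures is often wrong, which lessens their usefulness. In this paper, we propose to improve the 3D connectivity of our results by minimizing a sum of topology-aware losses on their 2D projections. This suffices to increase accuracy and to reduce the annotation effort required to provide annotated training data.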
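The structure of a loss summed over 2D projections can be sketched in plain Python as follows. Two things here are assumptions rather than details from the abstract: the use of maximum-intensity projections along the three axes, and the mean squared error in `placeholder_2d_loss`, which stands in only for a topology-aware 2D loss and is not itself topology-aware.

```python
def max_projection(volume, axis):
    """Maximum-intensity projection of a nested-list volume[z][y][x]."""
    nz, ny, nx = len(volume), len(volume[0]), len(volume[0][0])
    if axis == 0:  # collapse z, result indexed [y][x]
        return [[max(volume[z][y][x] for z in range(nz))
                 for x in range(nx)] for y in range(ny)]
    if axis == 1:  # collapse y, result indexed [z][x]
        return [[max(volume[z][y][x] for y in range(ny))
                 for x in range(nx)] for z in range(nz)]
    if axis == 2:  # collapse x, result indexed [z][y]
        return [[max(volume[z][y][x] for x in range(nx))
                 for y in range(ny)] for z in range(nz)]
    raise ValueError("axis must be 0, 1, or 2")


def placeholder_2d_loss(pred, target):
    """Mean squared error; a stand-in ONLY for a topology-aware 2D loss."""
    diffs = [(p - t) ** 2
             for p_row, t_row in zip(pred, target)
             for p, t in zip(p_row, t_row)]
    return sum(diffs) / len(diffs)


def projected_loss(pred_volume, target_volume):
    """Sum of the 2D loss over the three axis-aligned projections."""
    return sum(
        placeholder_2d_loss(max_projection(pred_volume, axis),
                            max_projection(target_volume, axis))
        for axis in range(3)
    )
```

In an actual training setup the placeholder would be replaced by a differentiable topology-aware loss evaluated on each projection, with the 3D network receiving gradients through the projection operation.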

ASL-Homework-RGBD Dataset: An annotated dataset of 45 fluent and non-fluent signers performing American Sign Language homeworks

Saad Hassan , Matthew Seita , Larwan Berke , Yingli Tian , Elaine Gale , Sooyeon Lee , Matt Huenerfauth

Category: Natural Language Processing

2022-07-08

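We are releasing a dataset containing videos of both fluent and non-fluent signers using American Sign Language (ASL), collected with a Kinect v2 sensor. The dataset was collected as part of a project to develop and evaluate computer vision algorithms that support new technologies for automatic detection of ASL fluency attributes. A total of 45 fluent and non-fluent participants were asked to perform signing homework assignments similar to those used in introductory or intermediate-level ASL courses. The data is annotated to identify several aspects of signing, including grammatical features and non-manual markers. Sign language recognition is currently very data-driven, and this dataset can support the design of recognition technologies, especially those that can benefit ASL learners. The dataset may also be of interest to ASL education researchers who want to contrast fluent and non-fluent signing.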
